Apparatus and method for embedding and extracting information in analog signals using distributed signal features and replica modulation

ABSTRACT

Apparatus and methods are provided for embedding or encoding digital data into an analog host or cover signal. A distributed signal feature of the cover signal in a particular domain (time, frequency or space) is calculated and compared with a set of predefined quantization values corresponding to an information symbol to be encoded. The amount of change required to modify the signal feature to the determined target quantization value is calculated, and the cover signal is modified accordingly so as to change the feature value over a predefined interval. Information symbols are extracted by the opposite process. In one embodiment, the signal feature is a short-term autocorrelation value of the cover signal.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of application with Ser. No. 13/315,595, filed Dec. 9, 2011, which is a continuation of application with Ser. No. 12/426,158, filed Apr. 17, 2009, now U.S. Pat. No. 8,085,935, which is a continuation of application with Ser. No. 10/763,288, filed Jan. 26, 2004, now U.S. Pat. No. 7,606,366, which is a continuation of application with Ser. No. 10/206,826, filed Jul. 29, 2002, now U.S. Pat. No. 6,683,958, which is a continuation of application with Ser. No. 09/106,213, filed Jun. 29, 1998, now U.S. Pat. No. 6,427,012, which is a continuation-in-part of application with Ser. No. 08/974,920, filed Nov. 20, 1997, now U.S. Pat. No. 6,175,627, and which also is a continuation-in-part of application with Ser. No. 08/858,562, filed May 19, 1997, now U.S. Pat. No. 5,940,135. The entire contents of the before-mentioned patent applications are incorporated by reference as part of the disclosure of this application.

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates to apparatus and methods for encoding and decoding information in analog signals, such as audio, video and data signals, either transmitted by radio wave transmission or wired transmission, or stored in a recording medium such as optical or magnetic disks, magnetic tape, or solid state memory.

2. Background and Description of Related Art

An area of particular interest to certain embodiments of the present invention relates to the market for musical recordings. Currently, a large number of people listen to musical recordings on radio or television. They often hear a recording which they like enough to purchase, but don't know the name of the song, the artist performing it, or the record, tape, or CD album of which it is part. As a result, the number of recordings which people purchase is less than it otherwise would be if there were a simple way for people to identify which of the recordings that they hear on the radio or TV they wish to purchase.

Another area of interest to certain embodiments of the invention is copy control. There is currently a large market for audio software products, such as musical recordings. One of the problems in this market is the ease of copying such products without paying those who produce them. This problem is becoming particularly troublesome with the advent of recording techniques, such as digital audio tape (DAT), which make it possible for copies to be of very high quality. Thus it would be desirable to develop a scheme which would prevent the unauthorized copying of audio recordings, including the unauthorized copying of audio works broadcast over the airwaves. It is also desirable for copyright enforcement to be able to insert into program material such as audio or video signals digital copyright information identifying the copyright holder, which information may be detected by appropriate apparatus to identify the copyright owner of the program, while remaining imperceptible to the listener or viewer.

Various prior art methods of encoding additional information onto a source signal are known. For example, it is known to pulse-width modulate a signal to provide a common or encoded signal carrying at least two information portions or other useful portions. In U.S. Pat. No. 4,497,060 to Yang (1985) binary data is transmitted as a signal having two differing pulse-widths to represent logical “0” and “1” (e.g., the pulse-width durations for a “1” are twice the duration for a “0”). This correspondence also enables the determination of a clocking signal.

U.S. Pat. No. 4,937,807 to Weitz et al. (1990) discloses a method and apparatus for encoding signals for producing sound transmissions with digital information to enable addressing the stored representation of such signals. Specifically, the apparatus in Weitz et al. converts an analog signal for producing such sound transmissions to clocked digital signals comprising for each channel an audio data stream, a step-size stream and an emphasis stream.

With respect to systems in which audio signals produce audio transmissions, U.S. Pat. No. 4,876,617 to Best et al. (1989) and U.S. Pat. No. 5,113,437 to Best et al. (1992) disclose encoders for forming relatively thin and shallow (e.g., 150 Hz wide and 50 dB deep) notches in mid-range frequencies of an audio signal. The earlier of these patents discloses paired notch filters centered about the 2883 Hz and 3417 Hz frequencies; the later patent discloses notch filters but with randomly varying frequency pairs to discourage erasure or inhibit filtering of the information added to the notches. The encoders then add digital information in the form of signals in the lower frequency indicating a “0” and in the higher frequency a “1”. In the later Best et al. patent an encoder samples the audio signal, delays the signal while calculating the signal level, and determines during the delay whether or not to add the data signal and, if so, at what signal level. The later Best et al. patent also notes that the “pseudo-random manner” in moving the notches makes the data signals more difficult to detect audibly.

Other prior art techniques employ the psychoacoustic model of the human perception characteristic to insert modulated or unmodulated tones into a host signal such that they will be masked by existing signal components and thus not perceived. See, e.g., Preuss et al., U.S. Pat. No. 5,319,735, and Jensen et al., U.S. Pat. No. 5,450,490. Such techniques are very expensive and complicated to implement, while suffering from a lack of robustness in the face of signal distortions imposed by perception-based compression schemes designed to eliminate masked signal components.

U.S. Pat. No. 5,613,004 to Cooperman et al. discloses a method for determining where to encode additional information into a stream of digital samples, wherein two pseudorandom keys are used to determine into which frequency bins of the digital data stream the additional information is to be encoded. A primary key has a number of bits equal to the sample window size. A secondary key or convolution mask has an arbitrary number of bits as a time mask, with each bit corresponding to a window. For each window, an encoder proceeds through each frequency bin, taking the corresponding bit of the primary key or mask and the bit of the convolutional mask corresponding to the window, and subjecting those bits to a boolean operation to determine whether or not the bin is to be used in the encoding process to encode the bits of the additional information message. When the last frequency bin in the window is processed, the next bit of the convolutional mask is retrieved and the primary mask is reset to the first bit. When the last window corresponding to the last bit of the convolutional mask is reached, the convolutional mask is reset to the first bit. Cooperman does not describe any specific method for the actual encoding of the additional information bits into the digital stream.

The prior art fails to provide a method and an apparatus for encoding and decoding auxiliary analog or digital information signals onto analog audio or video frequency signals for producing humanly perceived transmissions (i.e., sounds or images) such that the audio or video frequency signals produce substantially identical humanly perceived transmission prior to as well as after encoding with the auxiliary signals. The prior art also fails to provide relatively simple apparatus and methods for encoding and decoding audio or video frequency signals for producing humanly perceived audio transmissions with signals defining digital information. The prior art also fails to disclose a method and apparatus for limiting unauthorized copying of audio or video frequency signals for producing humanly perceived audio transmissions.

SUMMARY OF THE INVENTION

The present invention provides apparatus and methods for embedding or encoding, and extracting or decoding, digitized information in an analog host or cover signal in a way which has minimal impact on the perception of the source information when the analog signal is applied to an appropriate output device, such as a speaker, a display monitor, or other electrical/electronic device.

The present invention further provides apparatus and methods for embedding and extracting machine readable signals in an analog cover signal which control the ability of a device to copy the cover signal.

In summary, the present invention provides for the encoding or embedding of a data signal in an analog host or cover signal, by modulating the host or cover signal so as to modify a distributed feature of the signal within a predefined region. The distributed feature of the host signal is modified to a predefined quantization value which corresponds to a data symbol or binary digit of the data signal to be embedded. Subsequently, the embedded data signal is recovered by detecting the modified distributed feature values and correlating the detected values with the predefined relationship between data symbols and quantized distributed feature values.

The term cover signal as used hereinafter refers to a host or source signal, such as an audio, video or other information signal, which carries or is intended to carry embedded or hidden digitized data. The terms distributed feature or signal feature as used hereinafter refer to a scalar value obtained by processing the cover signal values over the totality of the regions within domains (i.e., time, frequency and/or space) where the data-embedding modulation is applied. One desirable property for such processing is that random changes in signal magnitudes caused by noise or other signal distortions have a minimal effect on the signal feature value, while the combined effect of modulation of signal magnitudes for embedding of digitized data over a predefined region produces a measurable change in the feature value.

In particular, the present invention provides a method for embedding an information symbol in an analog cover signal, comprising the steps of calculating a distributed signal feature value of the cover signal over a predefined region, comparing the calculated signal feature value with a predefined set of quantization values corresponding to given information symbols and determining a target quantization value corresponding to the information symbol to be embedded, calculating the amount of change required in the cover signal to modify the calculated signal feature to the target quantization value, and modifying the cover signal according to the calculated amount of change.

According to another aspect of the invention, a method is provided for extracting an information symbol embedded in an analog cover signal, comprising the steps of calculating a distributed signal feature value of the cover signal over a predefined region, comparing the calculated signal feature value with a predefined set of quantization values corresponding to given information symbols and determining which quantization value corresponds to the calculated signal feature value, and translating the determined quantization value into the information symbol contained in the cover signal and outputting the information symbol.

The present invention further provides apparatus for embedding information in accordance with the above method, and apparatus for extracting the embedded information from the cover signal.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other aspects of the present invention will become more fully understood from the following detailed description of the preferred embodiments in conjunction with the accompanying drawings, in which:

FIG. 1 is a block diagram of an auxiliary information signal encoding and decoding process according to a first embodiment of the present invention;

FIG. 2 is a block diagram of one embodiment of the encoder 10 of FIG. 1;

FIG. 3 is a block diagram of one embodiment of the host modifying signal generator 11 of FIG. 2;

FIG. 4 is a block diagram of one embodiment of the host modifying signal component generator 111 of FIG. 3;

FIG. 5 is a block diagram of an alternate host modifying signal generator according to the first embodiment of the present invention;

FIG. 6 is a block diagram of one embodiment of decoder 20 of FIG. 1;

FIG. 7 is a block diagram of short-term autocorrelation generator 21 according to the first embodiment of the present invention;

FIG. 8 is a block diagram of an alternate decoder 20 of FIG. 1 according to the first embodiment of the present invention;

FIG. 9 is a block diagram of a data signal embedding and extracting circuit according to a second embodiment of the present invention;

FIG. 10 is a block diagram of one embodiment of the embeddor 10 a of FIG. 9;

FIG. 11 is a block diagram of one embodiment of the embedded signal generator 11 a of FIG. 10;

FIG. 12 is a block diagram of one embodiment of the data signal extractor 20 a of FIG. 9;

FIG. 13 is a table illustrating an example of specifications of a stego key 9 used for embedding and extracting digital data in an audio signal, according to the second embodiment of the invention;

FIG. 14 is a block diagram of a second embodiment of the embedded signal generator 11 a of FIG. 10;

FIG. 15 is a block diagram of a second embodiment of the data signal extractor 20 a of FIG. 9, used with the embodiment of FIG. 14;

FIG. 16 is a block diagram of one embodiment of a replica generator which produces a cover signal replica shifted in frequency from the original; and

FIGS. 17(A)-17(C) are graphs showing a set of orthogonal functions used in the creation of an amplitude-shifted replica according to the embodiment of the present invention shown in FIGS. 14-16.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention is directed to a method and apparatus for embedding information or data onto a cover signal, such as an audio signal, video signal, or other analog signal, by modulating or changing the value of a distributed feature of the cover signal in a selected region of the frequency, time and/or space domains of the cover signal. The information or data to be encoded is preferably a digital or digitized signal. The invention can be implemented in a number of different ways, either by software programming of a digital processor, in the form of analog, digital, or mixed-signal integrated circuits, as a discrete component electronic device, or a combination of such implementations.

According to a first preferred embodiment of the invention, a method and apparatus are provided for encoding auxiliary information onto a host or source signal, such as an audio signal, video signal, or other data signal, by modulating or changing the short-term autocorrelation function of the host signal as a function of the auxiliary information over time, at one or more selected autocorrelation delays. The auxiliary information may be an analog or digital signal. The short-term autocorrelation function is obtained by multiplying a signal with a delayed version of itself, and integrating the product over a predefined integration interval.

The short-term autocorrelation function is modulated or changed by adding to the host signal a host modifying signal having a positive or negative correlation with the original host signal. The host modifying signal is preferably a controllably attenuated version of the host signal which has been delayed or advanced (for purposes of the invention, an advance will be considered a negative delay) in accordance with the selected autocorrelation delay.

The autocorrelation function can be modulated using the entire host signal or only a portion of it. In the preferred embodiment, frequency bands, temporal and/or spatial regions of the host signal are chosen so as to minimize the disturbance to the host signal as it affects the perception of the signal's output (i.e., audio or video quality).

Multiple host modifying signal components can be added to the host signal in the same or different frequency bands and temporal and/or spatial regions by generating host modifying signal components with different autocorrelation delays. The multiple host modifying signal components can represent different auxiliary information to increase overall auxiliary information throughput, or can represent the same auxiliary information to increase the robustness or security of the auxiliary information signal transmission.

Security is enhanced by maintaining confidential the information concerning specific parameters of the host modifying signal, which would be known only to the encoder and decoder of the system. The host modifying signal components may also have autocorrelation delays which vary over time according to a predetermined sequence or pattern, referred to herein as a “delay hopping pattern.”

First Embodiment

Referring now to the drawings, FIG. 1 shows a block diagram of the overall system according to a first embodiment of the invention. The system comprises an encoder 10 for encoding a host signal 2 (such as an audio or video program or source signal) with an auxiliary information signal 6, to produce an encoded signal 4. The encoded signal 4 may be transmitted over a communication medium, channel or line, or may be stored on a storage medium such as magnetic tape, optical memory, solid state memory, or electromagnetic memory, and also may be further processed such as by filtering, adaptive gain control, or other signal processing techniques, without impairing or degrading the encoded auxiliary information. The encoded signal 4 is then decoded in a decoder 20 to retrieve the auxiliary information signal 6.

FIG. 2 shows a detail of a first implementation of the encoder 10 of the first embodiment in which the host signal is modified by a single host modifying signal 8, produced by a host modifying signal generator 11 which receives the host signal 2 and the auxiliary information signal 6. The host modifying signal is added to the host signal in an adder 14 to provide the encoded signal 4.

The host modifying signal is obtained as shown in FIG. 3, which illustrates one embodiment of the host modifying signal generator 11. In this embodiment, the host signal 2 is filtered and/or masked by a filter/mask 110. The filter/mask 110 modifies the frequency, period, or spatial content of the host signal in such a manner as to cause minimal disturbance to the output characteristics of the host signal when applied to an output device such as a speaker or a video monitor. It is also possible for the filter/mask to pass the host signal unchanged, in which case the filtered/masked signal 3 would be equal to the host signal 2. The signal 3 is then inputted to a host modifying signal component generator 111, wherein it is modified according to an input auxiliary information signal 6, to produce a host modifying signal 8. The details of the host modifying signal component generator 111 are shown in FIG. 4.

As shown, the filtered host signal 3 is inputted to a delay/advance circuit 1110 to produce a delayed/advanced signal 3 a. The signal 3 is also inputted to a gain calculator 1112 along with auxiliary information signal 6. The purpose of the gain calculator 1112 is to calculate the gain of variable gain or attenuation circuit 1113 which is to be applied to delayed signal 3 a in order to obtain the host modifying signal 8. The amount of delay (or advancement) applied by delay/advance circuit 1110 corresponds to the autocorrelation delay at which the host signal is being modulated.

The amount of gain applied to the signal 3 a at any time or spatial region is determined by the gain calculator 1112 as a function of the values of the auxiliary information signal 6 and the filtered signal 3. The short-term autocorrelation of the filtered signal 3 can be expressed by the formula

$\begin{matrix}{{R\left( {t,\tau} \right)} = {\int_{t - T}^{t}{{s(x)}{s\left( {x - \tau} \right)}\ {x}}}} & (1)\end{matrix}$

where s(t) is the filtered signal 3, R(t,τ) is the short-term autocorrelation of s(t), τ is the delay at which the autocorrelation is evaluated, T is the integration interval, and t is time.
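
By way of illustration only, a discrete-time analogue of equation (1) may be sketched in Python as follows. The function name short_term_autocorr, the use of sample indices for t, τ and T, and the replacement of the integral by a sum over the T most recent samples are assumptions made for this sketch and are not part of the disclosure.

```python
from typing import Sequence


def short_term_autocorr(s: Sequence[float], t: int, tau: int, T: int) -> float:
    """Discrete analogue of equation (1): sum of s[x] * s[x - tau]
    over the T samples x = t-T+1 .. t. A negative tau denotes an advance."""
    start = t - T + 1
    if start < 0 or start - tau < 0 or t >= len(s) or t - tau >= len(s):
        raise ValueError("window or its delayed copy falls outside the signal")
    return sum(s[x] * s[x - tau] for x in range(start, t + 1))


# Illustrative host signal: a period-4 square-like wave.
s = [1.0, 1.0, -1.0, -1.0] * 8
r0 = short_term_autocorr(s, t=31, tau=0, T=16)  # zero delay: energy in the window
r2 = short_term_autocorr(s, t=31, tau=2, T=16)  # half-period delay: anti-correlated
```

For this test signal every product at zero delay is +1 and every product at a half-period delay is −1, so the two values have equal magnitude and opposite sign.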

By adding a host modifying signal e(t) to the filtered signal s(t), the autocorrelation function R(t, τ) is modulated to obtain a modulated autocorrelation function R_(m)(t,τ):

$\begin{matrix}\begin{matrix}{{R_{m}\left( {t,\tau} \right)} = {\int_{t - T}^{t}{\left( {{s(x)} + {e(x)}} \right)\left( {{s\left( {x - \tau} \right)} + {e\left( {x - \tau} \right)}} \right)\ {x}}}} \\{= {{R\left( {t,\tau} \right)} +}} \\{{\int_{t - T}^{t}\left( {{s(x)} + {e\left( {x - \tau} \right)} + {{e(x)}\left( {{s\left( {x - \tau} \right)} + {{e(x)}{e\left( {x - \tau} \right)}}} \right)\ {x}}} \right.}}\end{matrix} & (2)\end{matrix}$

By appropriately selecting the host modifying signal e(t), an increase or decrease of the short-term autocorrelation function can be achieved. It will be apparent that many different types of host modifying signals may be used to achieve this modulation. In the preferred embodiment, delayed or advanced versions of the host signal multiplied by a selected amount of gain or attenuation are used as the host modifying signal e(t). Specifically,

e(t)=gs(t−τ)  (3a)

or

e(t)=gs(t+τ)  (3b)

Substituting equations (3a) and (3b) respectively into equation (2), it is seen that the short-term autocorrelation of the resulting modified signal can be written as

R_(m)(t,τ)=R(t,τ)+gR(t,2τ)+gR(t−τ,0)+g²R(t−τ,τ)  (4a)

or

R_(m)(t,τ)=R(t,τ)+gR(t,0)+gR(t+τ,2τ)+g²R(t+τ,τ)  (4b)
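
The expansion of equation (4a) can be checked numerically. The following Python sketch is offered only as an illustration under stated assumptions: the signal is a pseudo-random test sequence, sums replace the integrals, and the values chosen for τ, T, t and g are arbitrary and do not come from the disclosure.

```python
import random


def R(s, t, tau, T):
    """Discrete short-term autocorrelation in the sense of equation (1)."""
    return sum(s[x] * s[x - tau] for x in range(t - T + 1, t + 1))


random.seed(0)  # arbitrary seed, used only for repeatability
s = [random.uniform(-1.0, 1.0) for _ in range(400)]
tau, T, t, g = 7, 200, 399, 0.05  # illustrative values only

# Equation (3a): e[n] = g * s[n - tau]; samples before the signal start count as zero.
e = [g * s[n - tau] if n >= tau else 0.0 for n in range(len(s))]
encoded = [a + b for a, b in zip(s, e)]

# Left side: autocorrelation measured directly on the modified signal.
lhs = R(encoded, t, tau, T)
# Right side: equation (4a), built only from autocorrelations of the host signal.
rhs = (R(s, t, tau, T) + g * R(s, t, 2 * tau, T)
       + g * R(s, t - tau, 0, T) + g * g * R(s, t - tau, tau, T))
```

Under these assumptions the two quantities agree to within floating-point rounding, since the discrete sums satisfy the same identity as the integrals.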

The autocorrelation functions R(t, τ) of the host signal which appear on the right hand side of equations (4a) and (4b) can be measured, and their values used to obtain the solution for gain g that will produce a desired value for the modulated autocorrelation function R_(m)(t, τ). It is typically desired to have small values for g so as to keep the host modifying signal transparent to the perceiver of the host signal. If this is the case, the g² terms in equations (4a) and (4b) can be ignored as negligible, such that the exact gain value can be closely approximated by

$\begin{matrix}{g \approx \frac{R_{m}(t,\tau) - R(t,\tau)}{R(t,2\tau) + R(t - \tau,0)}} & (5a) \\ \text{or} & \\ {g \approx \frac{R_{m}(t,\tau) - R(t,\tau)}{R(t,0) + R(t + \tau,2\tau)}} & (5b)\end{matrix}$

respectively. While the present invention is equally applicable to the encoding of analog auxiliary information signals, the following discussion assumes the auxiliary information signal is a digital signal having values taken from an M-ary set of symbols d_(i) ∈ {±1, ±3, . . . ±(2M−1)}, for i=1, 2, 3, . . . which are transmitted at times t=iT_(s), where T_(s) denotes the symbol interval or period. According to the first preferred embodiment of the invention, each auxiliary information symbol is associated with a corresponding value of the short-term autocorrelation function. One way to map the symbols onto the autocorrelation function value domain while keeping the host modifying signal small with respect to the host signal, is to employ the formula

R_(m)(iT_(s),τ)=εd_(i)R_(m)(iT_(s),0)  (6)

where ε is a small quantity selected to balance the requirement of signal robustness with the requirement that the host modifying signal be transparent to the perceiver. By inserting equations (4a) and (4b) respectively into equation (6), a quadratic equation for g is obtained, the solution of which provides the appropriate gain g_(i) for the symbol transmitted at time t=iT_(s). Alternatively, approximate values for g can be obtained using formulas (5a) or (5b). The gain is held constant over the symbol interval in order to minimize any errors. Further deviation of g_(i) from its desired value can be used at the boundaries of the symbol interval to avoid abrupt changes in the host modifying signal which might jeopardize the requirement for host modifying signal transparency. Modulation error caused by such smoothing does not significantly degrade the performance of the encoding system. The integration interval T should be shorter than T_(s)−τ in order to minimize intersymbol interference. However, certain overlap between adjacent symbols can be tolerated in order to increase the auxiliary channel bandwidth.
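
One possible way of solving for the gain is sketched below in Python, for the delayed case of equations (3a) and (4a). This is a sketch under assumptions rather than the disclosed implementation: the function name gain_for_symbol is hypothetical, sums replace integrals, and the zero-delay term is expanded as R_(m)(t,0)=R(t,0)+2gR(t,τ)+g²R(t−τ,0), which follows from equation (2) evaluated at zero delay with e(t)=gs(t−τ). The root of smaller magnitude is chosen so as to keep the host modifying signal small.

```python
import math
import random


def R(s, t, tau, T):
    """Discrete short-term autocorrelation in the sense of equation (1)."""
    return sum(s[x] * s[x - tau] for x in range(t - T + 1, t + 1))


def gain_for_symbol(s, t, tau, T, eps, d):
    """Solve R_m(t,tau) = eps*d*R_m(t,0) for g, assuming e[n] = g*s[n-tau].
    Returns the real root of smallest magnitude, or None if there is none."""
    k = eps * d
    a = R(s, t - tau, tau, T) - k * R(s, t - tau, 0, T)
    b = R(s, t, 2 * tau, T) + R(s, t - tau, 0, T) - 2 * k * R(s, t, tau, T)
    c = R(s, t, tau, T) - k * R(s, t, 0, T)
    if abs(a) < 1e-12:  # degenerate case: the equation is linear in g
        return -c / b if abs(b) > 1e-12 else None
    disc = b * b - 4 * a * c
    if disc < 0:
        return None
    root = math.sqrt(disc)
    return min(((-b + root) / (2 * a), (-b - root) / (2 * a)), key=abs)


random.seed(0)  # arbitrary seed, used only for repeatability
s = [random.uniform(-1.0, 1.0) for _ in range(400)]
tau, T, t, eps, d = 7, 200, 399, 0.05, 3  # illustrative values only

g = gain_for_symbol(s, t, tau, T, eps, d)
encoded = [s[n] + (g * s[n - tau] if n >= tau else 0.0) for n in range(len(s))]
achieved = R(encoded, t, tau, T) / R(encoded, t, 0, T)  # should equal eps * d
```

With the gain held constant over the whole window, the normalized autocorrelation of the modified signal lands on the target level εd up to rounding error.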

In an alternative implementation, the gain calculator 1112 may map a fixed gain to be applied to the filtered/masked and delayed/advanced signal 3 a according to only the value of the auxiliary information signal 6. According to this implementation, the gain calculator ignores the value of the signal 3, and as such the input line for signal 3 may be omitted. In this embodiment, the gain calculator will apply a fixed amount of gain depending on the value of the auxiliary signal 6. For example, in the instance where the auxiliary signal is a binary signal, the gain calculator could apply a predetermined positive gain for an auxiliary signal of “0” and a predetermined negative gain for an auxiliary signal of “1”. This approach will enable the encoder to have reduced complexity; however, it requires a larger modifying signal to obtain the same performance characteristics in terms of bit-error rate or signal robustness.
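
A minimal Python sketch of the fixed-gain alternative is given below. The gain magnitudes in FIXED_GAIN, the function names, the test signal and the window length are hypothetical; only the sign convention (positive gain for “0”, negative gain for “1”) is taken from the description. The bit is read back from the sign of the short-term autocorrelation at the encoding delay, which is the binary special case of decoding.

```python
import random


def R(s, t, tau, T):
    """Discrete short-term autocorrelation in the sense of equation (1)."""
    return sum(s[x] * s[x - tau] for x in range(t - T + 1, t + 1))


# Hypothetical gain values; the description fixes only their signs.
FIXED_GAIN = {0: +0.2, 1: -0.2}


def encode_bit_fixed_gain(s, tau, bit):
    """Add g * s[n - tau] to the host, with g chosen by the bit value alone."""
    g = FIXED_GAIN[bit]
    return [s[n] + (g * s[n - tau] if n >= tau else 0.0) for n in range(len(s))]


def decode_bit_by_sign(encoded, t, tau, T):
    """Positive autocorrelation at delay tau reads as 0, negative as 1."""
    return 0 if R(encoded, t, tau, T) > 0 else 1


random.seed(1)  # arbitrary seed, used only for repeatability
host = [random.uniform(-1.0, 1.0) for _ in range(4100)]
tau, T, t = 9, 4000, 4099  # illustrative values only

decoded = [decode_bit_by_sign(encode_bit_fixed_gain(host, tau, b), t, tau, T)
           for b in (0, 1)]
```

Because the gain is not adapted to the host signal, the natural autocorrelation of the host acts as interference, which is why a comparatively large fixed gain is used in this sketch.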

In order to recover the auxiliary information signal 6 from the encoded signal 4, the encoded signal is applied to a decoder 20. Details of one embodiment of the decoder 20 are shown in FIG. 6. According to this embodiment, the decoder consists of a short-term autocorrelation generator 21 and an auxiliary signal extraction circuit 22. As shown in FIG. 7, the short-term autocorrelation generator 21 includes a filter/mask 210 which filters and/or masks the encoded signal 4, and then obtains an autocorrelation signal by applying the filtered encoded signal to a squaring circuit 212, a delay circuit 214, and a multiplier 216. The output of the squaring circuit 212 and the output of the multiplier 216 are applied to short-term integrators 218 a and 218 b, respectively. The output of integrator 218 b is an autocorrelation signal 5. The outputs of integrators 218 a and 218 b are also applied to a normalization circuit 220, to produce a normalized autocorrelation signal 5 a. The filter/mask 210 can have the same characteristics as the filter/mask 110 of the encoder (or may be different), and in some circumstances may be omitted entirely. The delay circuit 214 uses the same delay τ as used in the delay/advance circuit 1110 of the encoder. The squaring circuit 212 calculates the square of the filtered encoded signal which, once integrated over interval T, is the same as the short-term autocorrelation with a delay of zero. The normalization circuit 220 outputs a normalized autocorrelation signal d(t), which is equal to:

$\begin{matrix}{{d(t)} = \frac{R_{m}\left( {t,\tau} \right)}{R_{m}\left( {t,0} \right)}} & (7)\end{matrix}$

In the special case where the auxiliary signal is in the form of binary data, the information symbols can be recovered by determining the sign (+ or −) of R_(m)(t,τ) at the individual sampled symbol intervals, and thus it would be unnecessary to calculate the zero delay autocorrelation and the normalized autocorrelation signal.

The auxiliary information signal is obtained from the normalized autocorrelation signal by the auxiliary signal extraction circuit 22. In the absence of signal distortion, d(t) has values at discrete points in time separated by T_(s) that are directly proportional to the magnitude of the input symbols. Signal extraction may be performed by one or more well known techniques in the art of digital communications, such as filtering, masking, equalization, synchronization, sampling, threshold comparison, and error control coding functions. Such techniques being well known, they will not be further elaborated upon.
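
As a hedged illustration of the normalization of equation (7) followed by a simple threshold comparison, the following Python sketch may be considered. The function names normalized_autocorr and nearest_symbol are hypothetical, sums replace integrals, and nearest-level decision is only one of the well known extraction techniques that could be used; filtering, synchronization and error control are omitted.

```python
def R(s, t, tau, T):
    """Discrete short-term autocorrelation in the sense of equation (1)."""
    return sum(s[x] * s[x - tau] for x in range(t - T + 1, t + 1))


def normalized_autocorr(encoded, t, tau, T):
    """Equation (7): d(t) = R_m(t, tau) / R_m(t, 0)."""
    return R(encoded, t, tau, T) / R(encoded, t, 0, T)


def nearest_symbol(d, eps, M):
    """Pick the symbol in {+-1, +-3, ..., +-(2M-1)} whose level eps*symbol
    lies closest to the measured normalized autocorrelation d."""
    symbols = range(-(2 * M - 1), 2 * M, 2)
    return min(symbols, key=lambda sym: abs(d - eps * sym))


# Illustrative measurements with eps = 0.05 and M = 2 (levels at +-0.05, +-0.15).
sym_a = nearest_symbol(0.14, eps=0.05, M=2)   # closest level is +0.15
sym_b = nearest_symbol(-0.02, eps=0.05, M=2)  # closest level is -0.05
```

The symbol set and the level spacing ε used here follow the mapping of equation (6); the numerical inputs are made-up measurements chosen to show the decision rule.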

According to a second implementation, each auxiliary data symbol may be associated with a set of short-term autocorrelation values, the particular set being chosen so as to minimize the value of g based upon the value of the auxiliary data symbol. As an example, for a binary-valued auxiliary signal, the bit transmitted at time iT_(s) is associated with the set of autocorrelation values 2jεR_(m)(iT_(s),0) for j=0, ±1, ±2, . . . etc. if it is a “1”, or the set (2j−1)εR_(m)(iT_(s),0) for j=0, ±1, ±2, . . . etc. if it is a “0”. The value of j for each bit is selected to minimize the magnitude of g obtained through solution of equations (4a) or (4b). Alternatively, approximate calculation can be performed by using equations (5a) or (5b) if j is chosen so that the value is nearest to R(t,τ). In this embodiment, the decoder operates in the same way as in the first implementation, except that multiple autocorrelation values are mapped to the same auxiliary information symbol.
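
The level selection of this second implementation can be sketched in Python as follows. The sketch works in normalized form, that is, with autocorrelation values divided by the zero-delay value so that the levels are integer multiples of ε; this normalization, the function names target_level and decode_bit, and the choice of the level nearest to the current value (the approximate rule mentioned above) are assumptions of the sketch.

```python
def target_level(bit, d_current, eps):
    """Return the multiple of eps nearest to d_current whose integer
    multiplier has the parity assigned to the bit:
    "1" -> even multiples 2j, "0" -> odd multiples 2j-1."""
    q = d_current / eps
    if bit == 1:
        k = 2 * round(q / 2)            # nearest even integer
    else:
        k = 2 * round((q + 1) / 2) - 1  # nearest odd integer
    return k * eps


def decode_bit(d_measured, eps):
    """Map the measured value to the nearest multiple of eps and read its parity."""
    k = round(d_measured / eps)
    return 1 if k % 2 == 0 else 0


# With eps = 0.05 and a current normalized autocorrelation of 0.12:
level_for_one = target_level(1, 0.12, 0.05)   # nearest even multiple: 2 * 0.05
level_for_zero = target_level(0, 0.12, 0.05)  # nearest odd multiple: 3 * 0.05
```

Whichever bit is embedded, the target is never more than ε away from the current value, which is what keeps the required gain small.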

According to a third implementation, the auxiliary information symbols are encoded as a difference in short-term autocorrelation functions at predefined time instances. For example, the symbol interval is divided into two equal parts and the autocorrelation function is determined for each part. The difference between the two autocorrelation functions is then changed so as to represent the auxiliary data. If the data symbol at iT_(s) is d_(i) ∈ {±1, ±3, . . . ±(2M−1)}, for i=1, 2, 3, . . . , then the desired difference can be expressed by

R_(m)(iT_(s),τ)−R_(m)((i+0.5)T_(s),τ)=εd_(i)R_(m)(iT_(s),0)  (8)

where ε is a small quantity determined to balance the robustness/transparency requirements. Substituting equations (4a) or (4b) into equation (8) produces a quadratic equation for g which can be solved to obtain the value of g which is applied to the host modifying signal in the first half of the symbol interval. Gain equal in magnitude but opposite in sign (polarity) is applied to the host modifying signal in the second half of the symbol interval. To minimize intersymbol interference the integration interval should be shorter than (T_(s)/2)−τ. A small amount of interference may be tolerated to obtain an increase in bit rate.

According to a fourth implementation, the host modifying signal is composed of a sum of multiple auxiliary information signal components, obtained according to the encoder shown in FIG. 5. Here, a plurality of filters/masks 110 a-110 m provide a plurality of host signals to a plurality of host modifying component generators 111 a-111 m, the outputs of which are added together in adders 13, 13 a, etc. to produce a host modifying signal 8 a. In this embodiment, M auxiliary signal components are generated by using differing amounts of delay in each of the component generators. The auxiliary signals 6 a-6 m can each be different, or may be the same in order to increase robustness and security level. A restriction is that for any two component generators having equal amounts of delay, and appearing in the same or overlapping frequency bands, time intervals or spatial masks, the auxiliary signals must be the same. In this instance the preferred host modifying signals take the form:

$\begin{matrix}{e(t) = \sum\limits_{m = 1}^{M} g_{m}\, s(t - \tau_{m})} & (9)\end{matrix}$

where τ_(m) and g_(m) represent the delay and gain for the mth host modifying signal component. By substituting equation (9) into equation (2), the following is obtained:

$\begin{matrix}{{R_{m}\left( {t,\tau} \right)} = {{R\left( {t,\tau} \right)} + {\sum\limits_{m = 1}^{M}{g_{m}\left( {{R\left( {t,{\tau_{m} + \tau}} \right)} + {R\left( {{t - \tau},{\tau_{m} - \tau}} \right)}} \right)}} + {\sum\limits_{{m\; 1} = 1}^{M}{\sum\limits_{{m\; 2} = 1}^{M}{g_{m\; 1}g_{m\; 2}{R\left( {{t - \tau_{m\; 1}},{\tau + \tau_{m\; 2} - \tau_{m\; 1}}} \right)}}}}}} & (10)\end{matrix}$

For a random signal s(t) and sufficiently large τ, R(t,τ) is much smaller than R(t,0). Therefore the set of delays {τ_(m)} should be chosen such that R_(m)(t,τ) calculated for τ=±τ_(m) according to equation (10) has only one term for which the short-term autocorrelation delay is equal to zero. This term will have the dominant effect on the modulation of R_(m)(t,τ_(m)). As different τ_(m) are chosen, different terms in equation (10) become dominant in the summation, effectively “tuning” different host modifying components.
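
As a rough discrete-time illustration of equation (9), the following sketch builds a host modifying signal as a gain-weighted sum of delayed copies of the host signal. The function name, the use of integer sample delays, and the treatment of samples before the start of the signal as zero are assumptions made for illustration only.

```python
def host_modifying_signal(s, delays, gains):
    """Return e[n] = sum over m of gains[m] * s[n - delays[m]] (sampled equation (9))."""
    if len(delays) != len(gains):
        raise ValueError("one gain per delay is required")
    e = [0.0] * len(s)
    for d, g in zip(delays, gains):
        # Samples before the start of the host signal are treated as zero (assumption).
        for n in range(d, len(s)):
            e[n] += g * s[n - d]
    return e


# Hypothetical values: two components with delays of 1 and 2 samples.
s = [1.0, 2.0, 3.0, 4.0]
e = host_modifying_signal(s, delays=[1, 2], gains=[0.5, -0.25])
```

Each delay contributes its own term to the sum, which is why a decoder that correlates at one particular delay responds mainly to the component generated with that delay.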

The decoder associated with this embodiment is shown in FIG. 8. The decoder includes a number of short-term autocorrelation generators 21 a-21 n, one for each delay amount for which a host modifying signal component was generated. The generated autocorrelation signals are processed together by auxiliary signal extraction circuit 22 and are either combined to obtain the auxiliary signal or independently processed to extract a multiplicity of auxiliary information signals.

According to a fifth implementation of the first embodiment of the invention, the host modifying signal components may change their corresponding autocorrelation delay amounts τ over time according to a predefined delay pattern, a technique referred to as “delay hopping.” The security of the auxiliary signal is enhanced by keeping the delay hopping pattern secret. The hopping pattern can be defined as a list of consecutive autocorrelation delays and their durations. An authorized decoder needs to know the hopping pattern as well as the filtering/masking parameters and signaling parameters (symbol duration and other symbol features). Multiple auxiliary signals can be carried simultaneously in the host signal if their hopping patterns are distinct, even if other filtering/masking and signaling parameters are the same.
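
A hopping pattern of the kind just described, a list of consecutive delays and their durations, might be represented as in the sketch below. The function name, the numeric values, and the choice to repeat the pattern once it is exhausted are illustrative assumptions, not details taken from this description.

```python
def delay_at(hopping_pattern, t):
    """Return the autocorrelation delay in force at time t.

    hopping_pattern is a list of (delay, duration) pairs in seconds.
    Assumption: the pattern repeats after its total duration has elapsed.
    """
    total = sum(duration for _, duration in hopping_pattern)
    t = t % total
    for delay, duration in hopping_pattern:
        if t < duration:
            return delay
        t -= duration
    return hopping_pattern[-1][0]


# Hypothetical pattern: 10 ms for 2 s, then 13 ms for 1 s, then 8 ms for 3 s.
pattern = [(0.010, 2.0), (0.013, 1.0), (0.008, 3.0)]
```

An embeddor and an authorized decoder sharing the same pattern would both call such a lookup to select the delay used for the current symbol.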

The first embodiment of the invention as described above may be modified in many ways as would become apparent to those skilled in the art from reading the present description. For example, in the above description of the first preferred embodiment of the invention, reference has been made to the perception of the host signal by a “perceiver.” In the context of the invention, a perceiver may be a device such as a computer, radar detector, or other electrical/electronic device in the case of the host signal being a communication signal, as well as a human in the case of audio or video host signals. Further, the implementation of the invention can be carried out using analog circuitry as well as digital circuitry such as ASICs (Application Specific Integrated Circuits), general purpose digital signal processors, microprocessors and equivalent apparatus. Further, it is possible for the characteristics of the filter/mask to change over time according to a predefined pattern which may have characteristic changes of varying duration. Finally, it is noted that a function similar to that of the present invention may be obtained under some circumstances using transform-domain processing techniques (such as Fourier or cepstral domain) which may be implemented using known algorithms such as the Fast Fourier Transform or FFT.

Second Embodiment

Referring to FIG. 9, according to a second preferred embodiment, the invention employs an embeddor 10 a to generate a stego signal 4 a, which is substantially the same as a cover signal 2 in terms of the content and quality of the information carried. For instance, where cover signal 2 is a video or audio signal, the stego signal 4 a will produce essentially the same video or audio program or information when applied to an output device such as a video display or loudspeaker.

A stego key 9 is used to determine and specify the particular region of the time, frequency and/or space domain of the cover signal 2 where the digital data 6 is to be embedded, as well as the distributed feature of the cover signal to be modified and the grid or table correlating digital data values with distributed feature quantization levels. For example, in the case of an audio signal, a particular frequency band and time interval define a region for embedding a data symbol. For a video signal, an embedding region is specified by a frequency band, a time interval in the form of an image field, frame or series of frames, and a particular area within the field or frame. FIG. 13 shows an example of the stego key specifications for frequency band, time interval, distributed signal feature, and symbol quantization grid, for an audio cover signal. Specific examples of distributed signal features are provided below.
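
One way to picture the stego key for an audio cover signal is as a small record holding the four kinds of specification listed above. The sketch below is purely illustrative: the class name, field names and all numeric values are hypothetical and are not the contents of FIG. 13.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class StegoKey:
    """Hypothetical container for audio stego key parameters."""
    band_hz: Tuple[float, float]          # frequency band used for embedding
    time_interval_s: Tuple[float, float]  # portion of the signal carrying the message
    feature: str                          # name of the distributed feature to modulate
    quantization_step: float              # spacing of the symbol quantization grid

    def in_band(self, freq_hz: float) -> bool:
        """Return True if a frequency lies inside the embedding band."""
        low, high = self.band_hz
        return low <= freq_hz <= high


# Illustrative values only.
key = StegoKey(
    band_hz=(1000.0, 3000.0),
    time_interval_s=(10.0, 30.0),
    feature="normalized_interval_energy",
    quantization_step=0.05,
)
```

Making the record immutable reflects the fact that the embeddor and the extractor must work from identical parameters.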

The embeddor then appropriately modulates or modifies the cover signal 2 to obtain a stego signal 4 a. Stego signal 4 a can be transmitted, or stored in a storage medium such as magnetic tape, CD-ROM, solid state memory, and the like for later recall and/or transmission. The embedded digital data is recovered by an extractor 20 a, having knowledge of or access to the stego key 9, which operates on the stego signal 4 a to extract the digital data 6.

FIG. 10 shows a block diagram of one embodiment of the embeddor 10 a. As shown, the cover signal 2, stego key 9, and digital data 6 are inputted to an embedded signal generator 11 a. The embedded signal generator modulates or modifies a predefined distributed feature of the cover signal 2 in accordance with the stego key 9 and digital data 6, and generates an embedded signal 8 a. The cover signal 2 is then modified by adding the embedded signal 8 a to the cover signal in an adder 12, to produce the stego signal 4 a.

FIG. 11 illustrates the details of an embedded signal generator 11 a used to generate a single embedded data signal. The cover signal 2 is filtered and/or masked in filtering/masking block 30 to produce a filtered/masked signal 31. The filtered/masked signal 31 is comprised of the selected regions of the cover signal, as specified by stego key 9, which are then used for embedding of data symbols. The signal 31 is then inputted to a feature extraction block 32, where the distributed feature to be modified, as specified by stego key 9, is extracted and provided to modulation parameter calculation module 34. Module 34 receives digital data 6 to be embedded in the cover signal, and determines the amount of modulation of the feature necessary to cause the feature to become approximately equal to the quantization value which corresponds to the digital data symbol or bit to be embedded. The calculation result 7 is then applied to modulation module 36, which modifies the filtered signal 31 to obtain the appropriate embedded signal component 8. The embedded signal component 8 is then added to the cover signal in adder 12 as shown in FIG. 10, to obtain the stego signal 4 a.

It is further possible to embed multiple digital data signals in the cover signal 2, by using multiple embedded signal generators, each using a different stego key to modify a different feature of the cover signal and/or to use different regions of the cover signal, so as to produce multiple embedded signal components, each of which is added to the cover signal 2. Alternatively, the different data signals may be embedded in a cascade fashion, with the output of one embeddor becoming the input of another embeddor using a different stego key.

According to an alternate embodiment, the filtering/masking module 30 may be eliminated. In this case, the cover signal is directly modified by the embedded signal generator to produce the stego signal. Accordingly, the adder 12 of FIG. 10 would not be required in this alternate embodiment.

A block diagram of an extractor 20 a used to recover the digital data embedded in the stego signal is shown in FIG. 12. The stego signal is filtered/masked in filter/mask module 30 a to isolate the regions where the digital data is embedded. The filtered signal 31 a is inputted to feature extraction module 32 a where the feature is extracted. The extracted feature 33 a is then inputted to data recovery module 40 where the extracted feature is mapped to the quantization table or grid correlating quantized feature values with specific data symbols. A multiplicity of extracted data symbols is then subjected to well-known error detection, error correction, and synchronization techniques to verify the existence of an actual message and proper interpretation of the content of the message. Specific examples of cover signal distributed feature modulation to embed data are given hereinafter.

First Example

In this example, the cover signal 2 is an audio signal. In this embodiment, the audio signal is first filtered to isolate a specific frequency band to be used for embedding a particular data message, to produce a filtered audio signal s(t). Other frequency bands can be used to embed other messages, either concurrently or in a cascaded processing technique. In addition, restricting the frequency band to be modulated to only a fraction of the overall signal spectrum reduces the effect of such modulation on the host or cover signal. The filtering step may be omitted, however, without affecting either the efficiency of the embedding process or the robustness of the embedded data.

Next, a function f(s(t)) of the filtered audio signal s(t) is calculated as follows:

f(s(t))=abs^(α)(s(t))  (11)

where abs( ) denotes an absolute value calculation, and α is a parameter. Systems using α=1 and α=0.5 have been successfully implemented by the present inventors.

Next, the function f(s(t)) is integrated over successive time intervals of length T to obtain:

$\begin{matrix}{I_{i} = {\int_{{({i - 1})}T}^{iT}{{f\left( {s(t)} \right)}\ {dt}}}} & (12)\end{matrix}$

where the interval T corresponds to the duration of a symbol.

In the fourth step, the distributed feature F_(i) for the i-th symbol is calculated according to the following:

$\begin{matrix}{F_{i} = \frac{I_{i}}{\sum\limits_{n = 1}^{N}{I_{i - n}\left( {1 + g_{i - n}} \right)}^{\alpha}}} & (13)\end{matrix}$

where g_(i−n), n=1, 2, . . . , N, are the gain values calculated for the N previous symbols, as shown below.

In the next step, the feature value F_(i) is compared to a set of quantization levels belonging to a particular symbol, as defined by the stego key 9. The quantization level nearest to F_(i) is determined. For example, in the case of binary digits, there are two sets, Q₀ and Q₁, corresponding to bits “0” and “1” respectively. The quantization levels in the sets Q₀ and Q₁ are defined as:

Q₀=q(2κε), κ=0,1,2, . . .

Q₁=q((2κ+1)ε), κ=0,1,2, . . .  (14)

where ε is the quantization interval that determines the robustness/transparency tradeoff, while q(x) is a monotonic function. Systems using q(x)=x and q(x)=log(x) have been successfully implemented.

Next, the gain value g_(i) to be applied in the i-th symbol interval is calculated according to:

g_(i)=(Q_(i)/F_(i))^(1/α)−1  (15)

where Q_(i) is the nearest element of the quantization set belonging to the i-th symbol.

In the following step, the gain g_(i) is applied to all signal amplitudes in the i-th symbol interval and the result is added back into the audio cover signal. Alternatively, this gain can be applied fully only in the middle portion of the symbol interval and tapered off toward the ends of the symbol interval. This approach reduces perception of the signal modification at the expense of a slight reduction in symbol robustness.
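
The steps of equations (11) to (15) can be sketched in discrete time as follows, with q(x)=x and sums standing in for the integrals. Several details are assumptions introduced only to make the sketch self-contained: the function names, an unmodified preamble of N symbols that supplies the "previous symbols" for the first data symbol, the exclusion of the zero quantization level (which would mute the symbol), and the matching extractor that decides each bit from the parity of the nearest grid index.

```python
import math


def symbol_integral(segment, alpha):
    # Discrete form of equations (11) and (12): sum of |s|^alpha over one symbol.
    return sum(abs(x) ** alpha for x in segment)


def nearest_level(feature, bit, eps):
    # Equation (14) with q(x) = x: even multiples of eps carry "0", odd ones carry "1".
    k = max(round((feature / eps - bit) / 2), 0)
    if bit == 0:
        k = max(k, 1)  # assumption: skip the zero level
    return (2 * k + bit) * eps


def embed(cover, bits, T, N, eps, alpha=1.0):
    # Assumption: the first N symbols form an unmodified preamble (gain 0).
    stego = list(cover[:N * T])
    history = [symbol_integral(cover[k * T:(k + 1) * T], alpha) for k in range(N)]
    for i, bit in enumerate(bits):
        seg = cover[(N + i) * T:(N + i + 1) * T]
        I_i = symbol_integral(seg, alpha)
        F_i = I_i / sum(history[-N:])               # equation (13)
        Q_i = nearest_level(F_i, bit, eps)
        g_i = (Q_i / F_i) ** (1.0 / alpha) - 1.0    # equation (15)
        stego.extend((1.0 + g_i) * x for x in seg)  # cover plus gain-scaled copy
        history.append(I_i * (1.0 + g_i) ** alpha)  # integral of the modified symbol
    return stego


def extract(stego, n_bits, T, N, eps, alpha=1.0):
    I = [symbol_integral(stego[k * T:(k + 1) * T], alpha) for k in range(N + n_bits)]
    # The parity of the nearest grid index gives the bit.
    return [round(I[i] / sum(I[i - N:i]) / eps) % 2 for i in range(N, N + n_bits)]


# Hypothetical test signal and parameters.
cover = [math.sin(0.1 * n) + 0.5 * math.sin(0.37 * n) for n in range(1200)]
bits = [1, 0, 0, 1, 1, 0, 1, 0, 1, 1]
stego = embed(cover, bits, T=100, N=2, eps=0.05)
recovered = extract(stego, len(bits), T=100, N=2, eps=0.05)
```

Because the denominator of the feature uses the integrals of the already modified previous symbols, the extractor can recompute the same feature from the stego signal alone.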

In order to extract the embedded data, the extractor first filters the stego signal in the same manner as the embeddor, which is defined by the stego key 9. Next, the feature is calculated according to equations (11) to (13), where it is assumed that the time interval T is known in advance as specified by the stego key 9, and the beginning of the embedded message coincides with the start of the extracting process.

In the next step, the embedded data symbols are extracted by mapping the calculated feature values to the quantization table or grid as defined by equation (14) (provided by the stego key 9), finding the closest match, and translating the quantization value into the corresponding symbol.

In the following step, consecutive extracted symbols are strung together and compared with a set of possible messages. If a match is found, the message is outputted to a user, or to a higher data protocol layer. If no match is found, repeated attempts at extraction are performed, by slightly shifting the starting time of the message by dT, which is a small fraction of the interval T (e.g., 0.01T to 0.1T).
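
The retry procedure just described can be sketched as a loop over candidate start offsets. In this illustration the extractor is passed in as a callable; `find_message`, `toy_extract`, and the toy signal are hypothetical and stand in for whichever feature-based extractor is actually used.

```python
def find_message(stego, extract_bits, messages, T, step_fraction=0.05):
    """Try successive start offsets until the extracted bits match a known message."""
    step = max(1, int(step_fraction * T))  # dT, a small fraction of T, in samples
    for offset in range(0, T, step):
        bits = tuple(extract_bits(stego[offset:]))
        if bits in messages:
            return offset, bits
    return None


def toy_extract(samples, T=4, n_bits=3):
    # Toy extractor for demonstration only: one bit per block of T samples,
    # decided by the sign of the block sum.
    return [1 if sum(samples[k * T:(k + 1) * T]) > 0 else 0 for k in range(n_bits)]


# Toy signal whose message starts a couple of samples late.
stego = [0, 0] + [1] * 4 + [-1] * 4 + [1] * 4
result = find_message(stego, toy_extract, {(1, 0, 1)}, T=4, step_fraction=0.25)
```

The search stops at the first offset that yields a valid message, so in practice the set of possible messages (or the error detection code) needs to be selective enough to reject misaligned reads.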

Second Example

In this example, after a filtering/masking step similar to the first example, a function f(s(t)) of the filtered audio signal s(t) is calculated according to the following:

f(s(t))=s^(2m)(t)  (16)

where m is an integer. Systems using m=1 and m=2 have been successfully implemented.

Next, two integrals are respectively generated over the first half and the second half of the i-th symbol interval:

$\begin{matrix}{{I_{1,i} = {\int_{{({i - 1})}T}^{{({i - 0.5})}T}{{f\left( {s(t)} \right)}\ {dt}}}},{I_{2,i} = {\int_{{({i - 0.5})}T}^{iT}{{f\left( {s(t)} \right)}\ {dt}}}}} & (17)\end{matrix}$

In the following step, the distributed feature F_(i) for the i-th symbol is calculated according to:

$\begin{matrix}{F_{i} = \frac{I_{1,i} - I_{2,i}}{I_{1,i} + I_{2,i}}} & (18)\end{matrix}$

Next, the calculated feature F_(i) is compared to a predefined set of quantization values for the given symbol to be embedded, and the nearest quantization value is chosen. In this embodiment, the sets Q₀ and Q₁ of quantization values for binary digit symbols “0” and “1” are defined as:

Q₀=q((2κ+0.5)ε), κ=0,±1,±2, . . .

Q₁=q((2κ−0.5)ε), κ=0,±1,±2, . . .  (19)

where ε is the quantization interval that determines the robustness/transparency tradeoff, while q(x) is a monotonic function. Successful implementations have been performed for q(x)=x and q(x)=x+ε/2.

In the next step, the gain g_(i) to be applied in the i-th symbol interval is calculated according to:

$\begin{matrix}{g_{i} \approx {\frac{1}{2\; m}\frac{Q_{i} - F_{i}}{1 - {Q_{i}F_{i}}}}} & (20)\end{matrix}$

where Q_(i) is the nearest element of the quantization set belonging to the i-th symbol. Equation (20) is derived as an approximation that holds well for small values of g_(i) and reduces the amount of computation with respect to an exact formula, with negligible effects on system robustness.

Next, the calculated gain g_(i) is applied to all signal amplitudes in the i-th symbol interval and the result is added back into the cover signal. Alternatively, the gain is applied fully only in the middle portion of the interval, and is tapered toward the ends of the interval.
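
A minimal single-symbol sketch of equations (16) to (20), with q(x)=x and sums in place of integrals, is given below. It assumes that the gain is applied with positive polarity in the first half of the symbol interval and negative polarity in the second half; that polarity scheme is an assumption under which equation (20) follows to first order in g_(i), and the text of this example does not state it explicitly. Function names and test values are hypothetical.

```python
import math


def half_energy_feature(seg, m=1):
    # Equations (16)-(18): normalized difference of s^(2m) sums over the two halves.
    half = len(seg) // 2
    i1 = sum(x ** (2 * m) for x in seg[:half])
    i2 = sum(x ** (2 * m) for x in seg[half:])
    return (i1 - i2) / (i1 + i2)


def nearest_level(feature, bit, eps):
    # Equation (19) with q(x) = x: (2k + 0.5) eps carries "0", (2k - 0.5) eps carries "1".
    offset = 0.5 if bit == 0 else -0.5
    k = round((feature / eps - offset) / 2)
    return (2 * k + offset) * eps


def embed_symbol(seg, bit, eps, m=1):
    F = half_energy_feature(seg, m)
    Q = nearest_level(F, bit, eps)
    g = (Q - F) / (2 * m * (1 - Q * F))  # equation (20)
    half = len(seg) // 2
    # Assumption: +g in the first half, -g in the second half of the symbol.
    return [(1 + g) * x for x in seg[:half]] + [(1 - g) * x for x in seg[half:]]


def extract_symbol(seg, eps, m=1):
    F = half_energy_feature(seg, m)
    # Levels lie at (n + 0.5) eps; even n decodes as "0", odd n as "1".
    return round(F / eps - 0.5) % 2


# Hypothetical test symbol: a tone whose amplitude rises slowly.
seg = [math.sin(0.3 * n) * (1 + 0.002 * n) for n in range(200)]
stego0 = embed_symbol(seg, 0, eps=0.1)
stego1 = embed_symbol(seg, 1, eps=0.1)
```

Since equation (20) is an approximation, the feature of the modified symbol lands close to, but not exactly on, the target level; the residual error is small compared with the quantization interval when the gain is small.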

The extractor process follows an analogous sequence to that described above for the first example.

Third Embodiment

The third embodiment of the invention is directed to a method and apparatus for embedding information or data onto a cover signal, such as an audio signal, video signal, or other analog signal (hereinafter called a “cover signal”), by generating a replica of the cover signal within a predefined frequency, time and/or space domain, modulating the replica with an auxiliary signal representing the information to be added to the cover signal, and then inserting the modulated replica back into the cover signal. The invention can be implemented in a number of different ways, either by software programming of a digital processor, in the form of analog, digital, or mixed-signal integrated circuits, as a discrete component electronic device, or a combination of such implementations. The replica is similar to the cover signal in time and frequency domain content, but different in certain parameters as specified by a stego key, which is not generally known, but which is known at authorized receiving apparatus.

According to this embodiment of the present invention, a replica of the cover signal 2 itself (see FIGS. 9 and 10) is used as a carrier for the auxiliary signal 6. Because the replica is inherently similar to the cover signal in terms of frequency content, no analysis of the cover signal is necessary in order to hide an auxiliary signal, such as a digital watermark.

In contrast, according to the prior art techniques discussed above, auxiliary signals are embedded in the form of a pseudorandom sequence (Preuss et al.) or in the form of multiple tones distributed over the frequency band of the cover signal (Jensen et al.). In order to “hide” such signals so that they are perceptually transparent, it was necessary to perform an analysis of the cover signal in the frequency domain to make the watermark signal imperceptible to the observer. Such analysis is based on the phenomenon that human perception will not detect a smaller signal in the presence of a larger signal if the two signals are sufficiently similar. This phenomenon is usually known as the masking effect.

The embedded signal 8 according to the present embodiment can be expressed by the formula:

w_(i)(t)=g_(i)m_(i)(t)r_(i)(t)  (21)

where g_(i)<1 is a gain (scaling factor) parameter determined by tradeoff considerations of robustness versus transparency, m_(i)(t) is the auxiliary signal 6, wherein |m_(i)(t)|≤1, and r_(i)(t) is a replica of the cover signal 2. The gain factor g_(i) can be a predetermined constant for a given application, or it can be adaptable, such that dynamic changes in transparency and robustness conditions can be taken into account. For example, in highly tonal musical passages the gains can be lower, while for spectrally rich or noisy audio signals the gains can be higher, with equivalent levels of transparency. In an alternate embodiment, the embeddor can perform an extractor process simulation to identify signals having less than desirable detectability, and increase the gain accordingly.

According to this embodiment, as shown in FIG. 10, the cover signal 2, stego key 9, and auxiliary signal (digital data) 6 are inputted to embedded signal generator 11 a, which generates replica r_(i)(t) from cover signal 2 according to the stego key 9, modulates or modifies the replica r_(i)(t) with auxiliary signal 6 (m_(i)(t)), scales the result using gain parameter g_(i), and generates an embedded signal 8 a (w_(i)(t)). The embedded signal 8 a is then added to the cover signal 2 (s(t)) in adder 12, to produce the stego signal 4 a (s̄(t)).

The replica r_(i)(t) is obtained by taking a portion of the cover signal 2 within a specified time, frequency and/or spatial domain as specified by the stego key 9, and then making slight modifications to the signal portion, also as specified by the stego key 9. The modifications to the signal portion need to be small to ensure that the replica remains similar to the cover signal as judged by the human psychoacoustic-psychovisual systems, but such modifications must be large enough to be detectable by an appropriately designed extractor having knowledge of or access to the stego key 9. As will be discussed below, a number of different types of modifications have been found to satisfy these requirements.

Equation (21) reveals that the replica r_(i)(t) is modulated by the auxiliary signal m_(i)(t) according to a process known as product modulation. Product modulation results in a broadening of the spectrum of the embedded signal proportionally to the spectral width of the auxiliary signal. In order to make the spectrum of the embedded signal similar to the spectrum of the cover signal (to preserve the transparency of the embedding process), the spectrum of the auxiliary signal must be narrow in comparison with the lowest frequency in the spectrum of the replica. This requirement imposes a limit on the capacity of the auxiliary channel, and dictates that low frequency components of the cover signal are unsuitable for inclusion in the creation of the replica.

In the preferred embodiment of the invention, the modulating signal (auxiliary signal) m(t) is a binary data signal defined by the formula:

$\begin{matrix}{{m(t)} = {\sum\limits_{n = 1}^{N}{b_{n}{h\left( {t - {nT}} \right)}}}} & (22)\end{matrix}$

where N is the number of binary digits or bits in the message, b_(n)∈{−1, 1} is the n-th bit value, T is the bit interval, and h(t) represents the shape of the pulse representing the bit. Typically, h(t) is obtained by low-pass filtering a rectangular pulse so as to restrict the spectral width of the modulating (auxiliary) signal.
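
A sampled version of equation (22) can be sketched as a train of ±1 rectangular pulses followed by a smoothing filter. The centered moving average used here is only a crude stand-in for the low-pass filter that shapes h(t); the function name, window length and bit values are illustrative assumptions.

```python
def modulating_signal(bits, samples_per_bit, smooth=5):
    """Sampled m(t): +/-1 rectangular pulses smoothed by a centered moving average."""
    raw = [float(b) for b in bits for _ in range(samples_per_bit)]
    if smooth <= 1:
        return raw
    out = []
    for n in range(len(raw)):
        lo = max(0, n - smooth // 2)
        hi = min(len(raw), n + smooth // 2 + 1)
        window = raw[lo:hi]
        out.append(sum(window) / len(window))  # shorter window at the edges
    return out


# Hypothetical four-bit message with values in {-1, 1}.
m = modulating_signal([1, -1, -1, 1], samples_per_bit=20)
```

The smoothing rounds off the bit transitions, which narrows the spectrum of m(t) while keeping its magnitude at or below one.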

FIG. 14 illustrates the details of an embedded signal generator 11 a used to generate a single embedded data message according to this embodiment. The cover signal 2 is filtered and/or masked in filtering/masking block 30 to produce a filtered/masked signal 31. The filter/mask block 30 separates regions of the cover signal used for different embedded messages. For example, the filter/mask block may separate the frequency band region 1000-3000 Hz from the cover signal in the frequency domain, may separate the time interval region t=10 seconds to t=30 seconds from the cover signal in the time domain, or may separate the upper right spatial quadrant region of the cover signal in the spatial domain (such as where the cover signal is an MPEG, JPEG or equivalent signal), which separated region would then be used for auxiliary signal embedding.

The filtered/masked signal 31 is comprised of the selected regions of the cover signal, as specified by stego key 9, which are then used for creation of the replica signal 1441. The signal 31 is then inputted to a replica creator 1440, where predetermined parameters of the signal are modified, as specified by stego key 9, to create the replica r_(i)(t) 1441. The replica 1441 is then modulated by the auxiliary signal m_(i)(t) in multiplier 1442 a, and the resultant signal is then scaled in multiplier 1442 b according to the selected gain factor g_(i) to produce embedded signal component 8 (i.e., w_(i)(t) in equation (21)). The embedded signal component 8 is then added back to the cover signal 2 in adder 12 (FIG. 10) to obtain the stego signal 4. In order to maintain synchronization between the cover signal 2 and the embedded signal component 8, inherent processing delays present in the filter/mask block 30 and replica creator block 1440 are compensated for by adding an equivalent delay in the cover signal circuit path (between the cover signal input and the adder 12) shown in FIG. 10.

It is further possible to embed multiple auxiliary data signals in the cover signal 2, by using multiple embedded signal generators, each using a different stego key to modify a different feature of the cover signal and/or to use different regions of the cover signal, so as to produce multiple embedded signal components, each of which is added to the cover signal 2. Alternatively, the different data signals may be embedded in a cascade fashion, with the output of one embeddor becoming the input of another embeddor using a different stego key. In either alternative, interference between embedded signal components must be minimized. This can be accomplished by using non-overlapping frequency, time or space regions of the signal, or by selecting appropriate replica creation parameters, as disclosed below.

A block diagram of an extractor used to recover the auxiliary data embedded in the stego signal is shown in FIG. 15. The stego signal 4 is filtered/masked in filter/mask module 30 a to isolate the regions where the auxiliary data is embedded. The filtered signal 31 a is inputted to replica creator 1440 a where a replica r̄_(i)(t) 1441 a of the stego signal is generated in the same manner as the replica r_(i)(t) of the cover signal in the replica creator block 1440 in the embeddor, using the same stego key 9. The replica r̄_(i)(t) of the stego signal 4 can be expressed by the formula:

$\begin{matrix}{{{\overset{\_}{r}}_{i}(t)} = {{{r_{i}(t)} + {\sum\limits_{i}{g_{i}{R\left( {{m_{i}(t)}{r_{i}(t)}} \right)}}}} \approx {r_{i}(t)}}} & (23)\end{matrix}$

where R(m_(i)(t)r_(i)(t)) represents the replica of the modulated cover signal replica. For sufficiently small gain factors g_(i), the replica of the stego signal is substantially the same as the replica of the cover signal.

In the extractor 20 a, the replica r̄_(j)(t) 1441 a is multiplied by the filtered stego signal 31 a in multiplier 1442 c to obtain the correlation product:

c(t)=r̄_(j)(t)s̄(t)≈r_(j)(t)s(t)+Σg_(i)m_(i)(t)r_(i)(t)r_(j)(t)  (24)

In designing the replica signal, one objective is to obtain spectra of the products r_(j)(t)s(t) and r_(i)(t)r_(j)(t), i≠j, with little low frequency content. On the other hand, the spectrum of the product r_(j)(t)r_(j)(t)=r_(j)²(t) contains a strong DC component, and thus the correlation product c(t) contains a term of the form g_(j)m_(j)(t)mean(r_(j)²), i.e., c(t) contains the scaled auxiliary signal m_(j)(t) as a summation term.

In order to extract the auxiliary signal m_(i)(t) from the correlation product c(t), filtering is performed on c(t) by filter 1444, which has a filter characteristic matching the spectrum of the auxiliary signal. For example, in the case of a binary data signal with a rectangular pulse shape, the matched filtering corresponds to integration over the bit interval. In the case of digital signaling, the filtering operation is followed by symbol regeneration in a regenerator 1446. A multiplicity of the extracted data symbols is then subjected to well-known error detection, error correction, and synchronization techniques to verify the existence of an actual message and proper interpretation of the content of the message.

One preferred embodiment of a replica creator 1440 is shown in FIG. 16. In this embodiment, a replica signal 1441 is obtained by shifting the frequency of the filtered cover signal 31 by a predetermined offset frequency f_(i) as specified by the stego key 9. This shifting process is also known as single sideband amplitude modulation, or frequency translation. In addition to the processing shown in FIG. 16, a number of different techniques known in the art are available to perform this process.

Blocks 1652 and 1654 represent respective phase shifts of the input signal s(t). To achieve the desired frequency shift, the relationship between the phase shifts must be defined as:

φ₁(f)−φ₂(f)=90°  (25)

The respective phase-shifted signals are multiplied by sinusoidal signals with frequency f_(i) in respective multipliers 1656 a and 1656 b. Block 1658 denotes a 90° phase shift of the sinusoidal signal applied to multiplier 1656 b. The resulting signals are then combined in summer 1659. Thus, the replica signal 1441 can be expressed as:

r_(i)(t)=s(t,φ₁)sin(2πf_(i)t)±s(t,φ₂)cos(2πf_(i)t)  (26)

where s(t, φ_(i)) denotes signal s(t) phase-shifted by φ_(i). The sign − or + in the summation process represents a respective shift up or down by f_(i). According to psychoacoustic models published in the literature, better masking may be achieved when the shift is upward. Accordingly, in the preferred embodiment subtraction is used in equation (26). In a special case, φ₁=90° and φ₂=0°, such that equation (26) becomes:

r_(i)(t)=s_(h)(t)sin(2πf_(i)t)±s(t)cos(2πf_(i)t)  (27)

where s_(h)(t) is the Hilbert transform of the input signal, defined by:

$\begin{matrix}{{s_{h}(t)} = {{1/\pi}{\int_{- \infty}^{\infty}{\frac{s(x)}{t - x}\ {x}}}}} & (28)\end{matrix}$

The Hilbert transform may be performed in software by various known algorithms, with equation (27) being suitable for digital signal processing. For analog signal processing, it is easier to design a circuit pair that maintains the 90° relative phase shift throughout the signal spectrum than to perform a Hilbert transform.
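
As one possible software realization of equation (27), the sketch below computes the Hilbert transform of a short sampled block through a direct discrete Fourier transform and then forms the frequency-shifted replica. The block-wise, periodic treatment of the signal, the O(N²) transform, the function names, and the expression of f_(i) in cycles per sample are all simplifying assumptions; a practical system would more likely use an FFT or a filter-based Hilbert transformer.

```python
import cmath
import math


def hilbert(s):
    """Hilbert transform of a real block of even length N, via a direct DFT."""
    N = len(s)
    X = [sum(s[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
         for k in range(N)]
    # Analytic-signal spectrum: keep DC and Nyquist, double the positive
    # frequencies, and zero the negative frequencies.
    Z = [0j] * N
    Z[0] = X[0]
    Z[N // 2] = X[N // 2]
    for k in range(1, N // 2):
        Z[k] = 2 * X[k]
    z = [sum(Z[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
         for n in range(N)]
    return [v.imag for v in z]


def frequency_shift_replica(s, f_i, up=True):
    """Equation (27) with f_i in cycles per sample: minus shifts up, plus shifts down."""
    s_h = hilbert(s)
    sign = -1.0 if up else 1.0
    return [s_h[n] * math.sin(2 * math.pi * f_i * n)
            + sign * s[n] * math.cos(2 * math.pi * f_i * n)
            for n in range(len(s))]


# Hypothetical test: a tone at bin 5 of a 64-sample block, shifted by 2 bins.
N = 64
s = [math.cos(2 * math.pi * 5 * n / N) for n in range(N)]
r_up = frequency_shift_replica(s, f_i=2 / N, up=True)
r_down = frequency_shift_replica(s, f_i=2 / N, up=False)
```

For this test tone, the upward-shifted replica is a tone at bin 7 (with inverted polarity) and the downward-shifted replica is a tone at bin 3, which agrees with the sign convention stated for equation (26).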

The particular frequency offset f_(i) can be chosen from a wide range of frequencies, and specified by the stego key. Multiple auxiliary signals can be inserted into the same time, frequency and/or space domain of the same cover signal, by having a different frequency offset value, to thus achieve a “layering” of auxiliary signals and increase auxiliary channel throughput.

The frequency offset also may be varied in time according to a predefined secret pattern (known as “frequency hopping”), to improve the security of a digital watermark represented by the auxiliary information.

The particular choice of frequency offset values is dependent upon the conditions and parameters of the particular application, and can be further fine tuned by trial and error. According to experimental results, optimal signal robustness in the presence of channel distortion was achieved where the frequency offset value was larger than the majority of spectrum frequencies of the modulating auxiliary signal m(t). On the other hand, optimal transparency was achieved where the frequency offset value was substantially smaller than the lowest frequency of the cover signal. As an example, for audio signal embedding, a cover signal above 500 Hz was used with a frequency offset of 50 Hz, while the modulating signal was a binary data signal with a bit rate of 25 bps.

In an alternative embodiment of a replica creator, the replica is generated by shifting the phase of the filtered/masked portion 31 of the cover signal by a predetermined amount defined by a function φ_(i)(f) for an i-th embedded signal. In this case, the replica generators 40 and 40 a are linear systems having a transfer function defined as:

H_(i)(f)=A_(i)e^(jφ_(i)(f))  (29)

where A_(i) is a constant with respect to frequency, j is the imaginary unit √(−1), and φ_(i)(f) is the phase characteristic of the system. Circuits described by equation (29) are known in the art as all-pass filters or phase correctors, and their design is well-known to those skilled in the art.

This embodiment is particularly suitable for auxiliary signal embedding in audio signals, since the human audio sensory system is substantially insensitive to phase shifts. The functions φ_(i)(f) are defined to meet the objective that the product of the replica and the cover signal contain minimal low frequency content. This can be achieved by maintaining at least a 90° shift for all frequency components in the filtered/masked signal 31. Multiple embedded messages have been implemented with little interference where the phase shift between frequency components of different messages is larger than 90° for the majority of the spectral components. The exact choice of the function φ_(i)(f) is otherwise governed by considerations of tradeoff between cost and security. In other words, the function should be complex enough so that it is difficult for unauthorized persons to determine the signal structure by analyzing the stego signal, even with the known cover signal, yet it should be computationally inexpensive to implement. A function hopping pattern which switches between different functions at predetermined intervals as part of the stego key can be used to further enhance security.

A special class of phase shift functions, defined by

φ_(i)(f)=τ_(i) f  (30)

where τ_(i) is a constant, results in time shift replicas of the cover signal. This class of functions has special properties in terms of cost/security tradeoff, which are beyond the scope of the present disclosure and will not be further treated here.

According to a further alternate embodiment of the invention, the replica generator obtains the replica signal by amplitude modulation of the cover signal. The amplitude modulation can be expressed by the equation

r_(i)(t)=a_(i)(t)s(t)  (31)

where a_(i)(t) is a class of orthogonal functions. FIGS. 17(a)-17(c) illustrate a set of three elementary functions a₁(t), a₂(t), and a₃(t) used to generate amplitude shifted replica signals, with each function being defined over the interval (0,T) where T equals the bit interval of the auxiliary signal. Longer replicas are generated by using a string of elementary functions. Post-correlation filtering in the extractor is performed by integration over the interval T, and the auxiliary channel bit b_(j,n) is extracted according to the formula

b_(j,n)=sign(A_(j,n)), where:

$\begin{matrix}\begin{matrix}{A_{j,n} = {\int_{{({n - 1})}T}^{nT}{{c(t)}\ {dt}}}} \\{\approx {{\int_{{({n - 1})}T}^{nT}{{a_{j}(t)}{s^{2}(t)}\ {dt}}} + {\sum\limits_{i}{g_{i}{\int_{{({n - 1})}T}^{nT}{{m_{i}(t)}{s^{2}(t)}{a_{i}(t)}{a_{j}(t)}\ {dt}}}}}}} \\{\approx {g_{j}{\int_{{({n - 1})}T}^{nT}{{m_{j}(t)}{s^{2}(t)}\ {dt}}}}}\end{matrix} & (32)\end{matrix}$

The above approximations hold, since

∫₀^(T)a_(j)(t) t = 0, ∫₀^(T)a_(i)(t)a_(j)(t) t = 0, fori ≠ j, and a_(j)²(t) = 1

As is apparent from equation (32), the sign of A_(j,n) (and the received bit value) depends on the sign of m_(j)(t) during the n-th bit interval, or in other words the transmitted bit value. The functions used for amplitude shifting generally should have a small low frequency content, a spectrum below the lowest frequency of the filtered/masked signal, and should be mutually orthogonal. The particular choice of functions depends upon the specific application, and is specified in the stego key.
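The amplitude modulation scheme of equations (31) and (32) can be sketched numerically as follows. This is only an illustration under stated assumptions: the elementary functions are taken to be Rademacher-style square waves with values ±1 (the actual shapes in FIGS. 17(a)-17(c) are not reproduced here), the cover is white noise rather than a filtered/masked audio signal, the extractor regenerates the replica from the stego signal itself, and the names `elementary_functions`, `embed`, and `extract` as well as the gain value are hypothetical.

```python
import numpy as np

def elementary_functions(n_samples, count=3):
    """Assumed a_i(t): +/-1 square waves, zero mean, mutually orthogonal over one bit interval."""
    n = np.arange(n_samples)
    return np.array([1 - 2 * ((n // (n_samples // 2 ** k)) % 2)
                     for k in range(1, count + 1)], dtype=float)

def embed(cover, bits, a, gain):
    """Add g * m_i(t) * r_i(t) for each channel i, with r_i(t) = a_i(t) s(t) as in equation (31)."""
    n_samples = a.shape[1]
    stego = cover.copy()
    for i, channel_bits in enumerate(bits):
        m_i = np.repeat(channel_bits, n_samples)   # m_i(t): +/-1, constant over each bit interval
        a_i = np.tile(a[i], len(channel_bits))     # string of elementary functions
        stego += gain * m_i * a_i * cover
    return stego

def extract(stego, j, a, n_bits):
    """Return b_(j,n) = sign(A_(j,n)) for channel j."""
    n_samples = a.shape[1]
    replica = np.tile(a[j], n_bits) * stego        # replica regenerated from the stego signal
    c = (stego * replica).reshape(n_bits, n_samples)
    A = c.sum(axis=1)                              # integration over each bit interval T
    return np.sign(A).astype(int)

rng = np.random.default_rng(0)
N, n_bits = 8192, 8                                # samples per bit interval, bits per channel
a = elementary_functions(N)
cover = rng.standard_normal(N * n_bits)
bits = rng.choice([-1, 1], size=(2, n_bits))       # two independent auxiliary channels
stego = embed(cover, bits, a, gain=0.2)
recovered = np.array([extract(stego, j, a, n_bits) for j in range(2)])
print(np.array_equal(recovered, bits))
```

Because the assumed functions satisfy the three conditions listed after equation (32), the cross-channel terms average out over each bit interval and the sign of each integral follows the embedded bit. A practical system would choose the functions and gains according to the stego key and the masking constraints described elsewhere in the disclosure.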

According to yet another alternative embodiment, a combination of different shifts in different domains can be executed simultaneously to generate a replica signal. For example, a time shift can be combined with a frequency shift, or an amplitude shift can be combined with a phase shift. Such a combination shift can further improve the hiding (security) property of the embedding system, and also improve detectability of the embedded signal by increasing the difference from the cover signal.

With respect to security, attacks would be expected that incorporate analysis designed to reveal the parameters of the stego key. If such parameters become known, then the embedded signal can be overwritten or obliterated by use of the same stego key. Use of a combination of shifts makes such analysis more difficult by enlarging the parameter space.

With respect to detectability, certain naturally occurring signals may have a content similar to a replica signal; for example, echo in an audio signal may produce a phase shifted signal, choral passages in a musical program may produce a frequency shifted signal, and tremolo may produce amplitude shifts, which may interfere with embedded signal detection. Use of a combination of shifts reduces the likelihood that a natural phenomenon will exactly match the parameters of the stego key, and interfere with signal detection.
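The detectability argument can be illustrated with a small numerical sketch. It assumes a white-noise cover carrying a natural echo and no embedded data, a time-shift-only detector whose delay happens to coincide with the echo delay, and a combined detector that additionally applies a zero-mean ±1 amplitude function; all names and parameter values below are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
N, delay = 2 ** 16, 50
n = np.arange(N)

s = rng.standard_normal(N)
echoed = s + 0.5 * np.roll(s, delay)       # cover with a natural echo, nothing embedded

# Assumed amplitude function: zero-mean square wave with values +/-1
a = 1.0 - 2.0 * ((n // (N // 8)) % 2)

# Detector statistic for a replica using the time shift alone
time_only = np.dot(echoed, np.roll(echoed, delay))

# Detector statistic for a replica combining the time shift with the amplitude shift
combined = np.dot(echoed, a * np.roll(echoed, delay))

print(time_only / N, combined / N)
```

In this sketch the echo produces a large response in the time-shift-only statistic, whereas the combined statistic stays near zero because the echo does not carry the amplitude pattern. This is in line with the observation that a combination of shifts is less likely to be matched by a natural phenomenon.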

The invention having been thus described, it will be apparent to those skilled in the art that the same may be varied in many ways without departing from the spirit and scope of the invention. Any and all such modifications as would be apparent to those skilled in the art are intended to be covered by the following claims.

What is claimed is:
 1. An apparatus, comprising: a comparator configured to compare a host content with a predetermined auxiliary information carrier, wherein the auxiliary information carrier is based on predefined attributes of the host content and comprises a plurality of components having varying amounts of delay or offset from each other; a correlator configured to correlate the host signal with a particular component of the auxiliary information carrier, and an extractor configured to determine a value of an embedded information symbol as a function of a value of the correlated component and a predefined relationship between information symbol values and components of the auxiliary information carrier.
 2. The apparatus of claim 1, wherein the value of the embedded information symbol enables identification of an attribute of the host content.
 3. The apparatus of claim 2, wherein the attribute comprises copy management rules associated with the host content.
 4. The apparatus of claim 2, wherein the attribute comprises usage control rules associated with the host content.
 5. The apparatus of claim 2, wherein the attribute comprises distribution control rules associated with the host content.
 6. A non-transitory computer-readable storage medium with a host content embodied thereupon, the host content comprising: one or more auxiliary information symbols that are imperceptibly embedded in the host content, wherein reception of the host content at a content handling device equipped with a watermark extractor triggers the watermark extractor to: compare the host content with a predetermined auxiliary information carrier, wherein the auxiliary information carrier is based on predefined attributes of the host content and comprises a plurality of components having varying amounts of delay or offset from each other; correlate the host signal with a particular signal component of the auxiliary information carrier; and determine a value of an embedded information symbol as a function of the value of the correlated component and a predefined relationship between information symbol values and components of the auxiliary information carrier.